Brown Dwarf: A Distributed Data Warehouse for the Cloud
Abstract
In this paper we present Brown Dwarf, a distributed system designed to efficiently store, query and update multidimensional data over commodity network nodes, without the use of any proprietary tool. Brown Dwarf distributes a highly effective centralized structure among peers on the fly, reducing cube creation and query times through parallelization. Point queries, aggregate queries and updates are all performed on-line by cooperating nodes that hold parts of a fully or partially materialized data cube. The system also employs an adaptive replication scheme that expands or shrinks the units of the distributed data structure, keeping storage consumption minimal while guarding against failures and load skew. Brown Dwarf has many of the features expected of an application deployed in the Cloud: it adapts its resources according to demand, allows on-line, fast and efficient storage and processing of large amounts of data, and is cost-effective in terms of both the required hardware and software components. Our system has been evaluated on both actual and simulation-based testbeds. To outline the findings of our extensive experiments, Brown Dwarf accelerates cube creation by up to 5 times and querying by up to several tens of times by exploiting the capabilities of the available network nodes working in parallel. Incurring only a small storage overhead compared to the centralized algorithm, it distributes the structure fairly evenly across the overlay nodes. It adapts quickly even after sudden bursts in load and remains unaffected even when a considerable fraction of nodes fail frequently. These advantages are even more apparent for dense and skewed datacubes and workloads.
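To make the distinction between point and aggregate queries on a materialized data cube concrete, the following is a minimal, centralized sketch in Python. It is not the Brown Dwarf structure or its distribution scheme, which the abstract does not detail; it simply materializes every group-by of a tiny fact table in one in-memory dictionary. The names `ALL`, `build_cube` and `query`, and the sample facts, are assumptions made for illustration only.

```python
from collections import defaultdict
from itertools import product

ALL = "*"  # hypothetical wildcard marking a dimension that has been aggregated away


def build_cube(facts):
    """Fully materialize a cube from (dim_1, ..., dim_d, measure) tuples.

    Every fact contributes to 2^d cells: each dimension is either kept
    at its concrete value or rolled up to ALL.
    """
    cube = defaultdict(int)
    for *dims, measure in facts:
        for mask in product((False, True), repeat=len(dims)):
            key = tuple(ALL if rolled else value
                        for value, rolled in zip(dims, mask))
            cube[key] += measure
    return dict(cube)


def query(cube, *key):
    """Answer a point query (no ALL) or an aggregate query (some ALL) by lookup."""
    return cube.get(tuple(key), 0)


# Illustrative fact table: (store, customer, product, sales)
facts = [
    ("S1", "C1", "P1", 10),
    ("S1", "C2", "P1", 20),
    ("S2", "C1", "P2", 5),
]
cube = build_cube(facts)

point_result = query(cube, "S1", "C1", "P1")   # point query: one concrete cell
store_total = query(cube, "S1", ALL, ALL)      # aggregate over customer and product
grand_total = query(cube, ALL, ALL, ALL)       # aggregate over every dimension
```

Because every aggregate is precomputed, both query types reduce to a single lookup. The cost is storage that grows exponentially with the number of dimensions, which is the overhead that compressed cube structures and partial materialization aim to reduce.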
Similar resources
Cloud Computing Technology Algorithms Capabilities in Managing and Processing Big Data in Business Organizations: MapReduce, Hadoop, Parallel Programming
The objective of this study is to verify the importance of the capabilities of cloud computing services in managing and analyzing big data in business organizations. The rapid development in the use of information technology in general, and network technology in particular, has led many organizations to make their applications available for use via electronic platforms hos...
Cloud formation in substellar atmospheres
Clouds seem like an everyday experience. But do we know how clouds form on brown dwarfs and extra-solar planets? What do they look like? Can we see them? What are they composed of? Cloud formation is an old-fashioned but still outstanding problem for the Earth's atmosphere, and it has turned into a challenge for the modelling of brown dwarf and exo-planetary atmospheres. Cloud formation imposes...
Low mass T Tauri and young brown dwarf candidates in the Chamaeleon II dark cloud found by DENIS
We define a sample designed to select low-mass T Tauri stars and young brown dwarfs using DENIS data in the Chamaeleon II molecular cloud. We use a star count method to construct an extinction map of the Chamaeleon II cloud. We select our low-mass T Tauri star and young brown dwarf candidates by their strong infrared color excess in the I − J/J − Ks color-color dereddened diagram. We retain onl...
Spectroscopy of Brown Dwarf Candidates in the NGC 1333 Molecular Cloud
We present an analysis of low-resolution infrared spectra for 25 brown dwarf candidates in the NGC 1333 molecular cloud. Candidates were chosen on the basis of their association with the high column density cloud core, and near-infrared fluxes and colors. We compare the depths of water vapor absorption bands in our candidate objects with a grid of dwarf, subgiant, and giant standards to determi...
Data Replication-Based Scheduling in Cloud Computing Environment
High-performance computing and vast storage are two key factors required for executing data-intensive applications. In comparison with traditional distributed systems like data grid, cloud computing provides these factors in a more affordable, scalable and elastic platform. Furthermore, accessing data files is critical for performing such applications. Sometimes accessing data becomes...